Adaptive Geometric Multiscale Approximations for Intrinsically Low-dimensional Data
Authors
Abstract
We consider the problem of efficiently approximating and encoding high-dimensional data sampled from a probability distribution ρ in R^D that is nearly supported on a d-dimensional set M, for example a d-dimensional Riemannian manifold. Geometric MultiResolution Analysis (GMRA) provides a robust and computationally efficient procedure to construct low-dimensional geometric approximations of M at varying resolutions. We introduce a thresholding algorithm on the geometric wavelet coefficients, leading to what we call adaptive GMRA approximations. We show that these data-driven, empirical approximations perform well, when the threshold is chosen as a suitable universal function of the number of samples n, on a wide variety of measures ρ that are allowed to exhibit different regularity at different scales and locations, thereby efficiently encoding data from more complex measures than those supported on manifolds. These approximations yield a data-driven dictionary, together with a fast transform mapping data to coefficients and an inverse of that map. The algorithms for both the dictionary construction and the transforms have complexity C n log n, with the constant C linear in D and exponential in d. Our work therefore establishes adaptive GMRA as a fast dictionary learning algorithm with approximation guarantees. We include several numerical experiments on both synthetic and real data, confirming our theoretical results and demonstrating the effectiveness of adaptive GMRA.
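The thresholding idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a simple median split along the first principal direction in place of the paper's multiscale partition, and it stops refining greedily from the top down as soon as the detail between a cell's plane and its children's planes falls below the threshold. The names `local_pca`, `project`, `adaptive_approximation`, `tau`, and `min_points` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def local_pca(cell):
    """Return the mean of a cell and its principal directions (rows of vt)."""
    center = cell.mean(axis=0)
    _, _, vt = np.linalg.svd(cell - center, full_matrices=False)
    return center, vt


def project(cell, center, vt, d):
    """Project the points of a cell onto the d-dimensional affine plane
    through `center` spanned by the first d principal directions."""
    basis = vt[:d]
    return center + (cell - center) @ basis.T @ basis


def adaptive_approximation(points, d, tau, min_points=10):
    """Toy adaptive multiscale approximation (assumption-laden sketch).

    A cell is split only if the detail between the parent's plane and the
    children's planes, weighted by the cell's share of the data, is at
    least tau. Returns the approximated points and the number of leaves.
    """
    n_total = len(points)
    approx = np.empty_like(points, dtype=float)

    def refine(idx):
        cell = points[idx]
        center, vt = local_pca(cell)
        coarse = project(cell, center, vt, d)

        # Too few points to split: keep the coarse plane for this cell.
        if len(idx) < 2 * min_points:
            approx[idx] = coarse
            return 1

        # Hypothetical partition rule: median split along the first
        # principal direction of the cell.
        scores = (cell - center) @ vt[0]
        left = scores <= np.median(scores)
        if left.all() or not left.any():
            approx[idx] = coarse
            return 1

        # Finer approximation: one plane per child.
        fine = np.empty_like(coarse)
        for mask in (left, ~left):
            c_center, c_vt = local_pca(cell[mask])
            fine[mask] = project(cell[mask], c_center, c_vt, d)

        # Empirical size of the detail ("wavelet coefficient") for this cell.
        coefficient = np.sqrt(np.sum((fine - coarse) ** 2) / n_total)
        if coefficient < tau:
            approx[idx] = coarse
            return 1
        return refine(idx[left]) + refine(idx[~left])

    leaves = refine(np.arange(n_total))
    return approx, leaves
```

With a very large `tau` the sketch returns a single global PCA plane, and with `tau = 0` it refines until cells become too small to split; intermediate values refine only where the data is poorly captured by a coarse plane, which is the behaviour the abstract describes qualitatively.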
Similar resources
Multiscale Geometric Dictionaries for Point-cloud Data
We develop a novel geometric multiresolution analysis for analyzing intrinsically low-dimensional point clouds in high-dimensional spaces, modeled as samples from a d-dimensional set M (in particular, a manifold) embedded in R^D, in the regime d ≪ D. This type of situation has been recognized as important in various applications, such as the analysis of sounds, images, and gene arrays. In this pape...
Multi-Resolution Geometric Analysis for Data in High Dimensions
Large data sets arise in a wide variety of applications and are often modeled as samples from a probability distribution in high-dimensional space. It is sometimes assumed that the support of such a probability distribution is well approximated by a set of low intrinsic dimension, perhaps even a low-dimensional smooth manifold. Samples are often corrupted by high-dimensional noise. We are interest...
Some recent advances in multiscale geometric analysis of point clouds
We discuss recent work based on multiscale geometric analysis for the study of large data sets that lie in high-dimensional spaces but have low-dimensional structure. We present three applications: the first one to the estimation of intrinsic dimension of sampled manifolds, the second one to the construction of multiscale dictionaries, called geometric wavelets, for the analysis of point clouds...
Multiscale Strategies for Computing Optimal Transport
This paper presents a multiscale approach to efficiently compute approximate optimal transport plans between point sets. It is particularly well-suited for point sets that are in high dimensions but are close to being intrinsically low-dimensional. The approach is based on an adaptive multiscale decomposition of the point sets. The multiscale decomposition yields a sequence of optimal transpor...
High-Dimensional Menger-Type Curvatures - Part I: Geometric Multipoles and Multiscale Inequalities
We define discrete Menger-type curvature of d+2 points in a real separable Hilbert space H by an appropriate scaling of the squared volume of the corresponding (d+1)-simplex. We then form a continuous curvature of an Ahlfors regular measure μ on H by integrating the discrete curvature according to products of μ (or its restriction to balls). The essence of this work, which continues in a subseq...
Journal:
- CoRR
Volume: abs/1611.01179
Pages: -
Publication date: 2016